Abstract

We present the development and validation of a new multivariate b-jet identification algorithm
("b tagger") used at the CDF experiment at the Fermilab Tevatron. At collider experiments, b taggers
allow one to distinguish particle jets containing B hadrons from other jets. Employing feed-forward
neural network architectures, this tagger is unique in its emphasis on using information
from individual tracks. This tagger not only contains the usual advantages of a multivariate technique,
such as maximal use of information in a jet and tunable purity/efficiency operating points,
but is also capable of evaluating jets with only a single track. To demonstrate the effectiveness of
the tagger, we employ a novel method wherein we calculate the false tag rate and tag efficiency as
a function of the placement of a lower threshold on a jet's neural network output value in Z + 1 jet
and $t\bar{t}$ candidate samples, rich in light-flavor and b jets, respectively.
1. Introduction

The identification of jets originating from b quarks is an important part of many analyses at
high-energy physics colliders. Searches for the Higgs boson and measurements of top-quark properties
depend on the ability to identify b jets properly. Furthermore, in many new physics models,
the third generation holds a special role, and therefore final states with b-quark jets are common.
The high momentum of B hadrons coupled with their long lifetimes results in a large distance between
the interaction point and the decay vertex of the B hadron (decay length). Additionally, a
significant fraction (approximately 20%) of B hadrons decay with a soft lepton, i.e., a charged lepton
with a few GeV of momentum. These qualities are key to distinguishing b-quark jets from other types of jets.



*Corresponding author

Email address: wittich@cornell.edu (P. Wittich)

Present address: Lawrence Berkeley National Laboratory, Berkeley, CA, 94720



Preprint submitted to Nuclear Instruments and Methods A 



October 25, 2011 



Almost all information as to whether or not a given jet originates from a B-hadron decay is
carried in the tracks its charged particles leave in the detector. There are a few salient features of
B-hadron decays which can be searched for via the tracks in a jet. The lifetime of a $B^0$ ($B^{\pm}$, $\Lambda_b$)
hadron is 1.52 ps (1.64 ps, 1.42 ps). The distances these particles travel during their lifetimes can
be resolved by the CDF tracking system, and it is therefore possible to identify the delayed decay
of a B hadron through the displacement of individual tracks with respect to the primary interaction
point (the primary vertex) and also through the combining of tracks in the form of a fitted
secondary decay vertex. Due to the large mass of the b quark, the decay products of B hadrons
will form a larger invariant mass than those of hadrons not containing b quarks. Furthermore, the
large relativistic boost typical of a B hadron will result in decay products which tend to be more
energetic and collimated within a jet cone than other particles. Finally, particle multiplicities tend
to be different for jets containing B-hadron decays compared to other jets; in particular, muons and
electrons appear in approximately 20% of jets containing a B hadron, typically either directly via
semileptonic decay of the B or indirectly through the semileptonic decay of a charm hadron (such as a D)
resulting from a B decay.

Many algorithms used at CDF were instrumental in the 1995 discovery of the top quark [1].
Here we review the standard b-tagging algorithms used at CDF. Techniques similar to those described
in this paper have been developed at the D0 experiment [2] and at the CMS and ATLAS
experiments at the LHC [3, 4].

SecVtx [5] is a secondary vertex tagger. It is the most commonly used b tagger at CDF. Using
only significantly displaced tracks that pass certain quality requirements within each jet's cone, an
iterative method is used to fit a secondary vertex within the jet. Given the relatively long lifetime
of the B hadron, the significance of the two-dimensional decay length $L_{xy}$ in the $r$-$\phi$ plane is used
to select b-jet candidates. The algorithm can be performed with different sets of track requirements
and threshold values. In practice, three operating points are used, referred to as "loose", "tight",
and "ultra tight".

The jet probability [6] tagger, on the other hand, does not look for a secondary vertex, but
instead uses the distribution of the impact parameter significance of tracks in a jet, where impact
parameter significance is defined as the impact parameter divided by its measured uncertainty
($d_0/\sigma_{d_0}$). By comparing these values to the expected distribution of values from light jets, it is
possible to determine the fraction of light jets whose tracks would be more significantly displaced
from the primary vertex than those of the jet under study. While light-flavor jets should yield
a fraction uniformly distributed from 0 to 1, due to the long B lifetime, b jets often produce
significantly displaced tracks and hence tend toward a fraction of 0. Although this algorithm
produces a continuous variable for discriminating b jets, in practice only three operating points are
supported (jet probability < 0.5%, 1%, and 5%).
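
The idea behind a jet-probability-style discriminant can be illustrated with a minimal sketch. The text above does not give the formula, so the combination used here (the product of per-track probabilities times a truncated series in its logarithm, as commonly used for this class of tagger) is an assumption, as are the function names and the use of an empirical reference sample of light-jet track significances.

```python
import math
from bisect import bisect_left


def track_probability(significance, light_reference):
    """Fraction of reference light-jet tracks at least as displaced as this track.

    `light_reference` is a hypothetical list of impact parameter
    significances (d0 / sigma_d0) measured for tracks in light jets.
    """
    ref = sorted(light_reference)
    n_at_least = len(ref) - bisect_left(ref, significance)
    # Floor at one reference track so the probability is never exactly zero.
    return max(n_at_least, 1) / len(ref)


def jet_probability(track_probs):
    """Combine per-track probabilities into one jet-level probability.

    Assumed combination: P = Pi * sum_{k<N} (-ln Pi)^k / k!, where Pi is the
    product of the N track probabilities. For light jets (uniform track
    probabilities) this yields a uniformly distributed result.
    """
    product = math.prod(track_probs)
    log_term = -math.log(product)
    n = len(track_probs)
    return product * sum(log_term ** k / math.factorial(k) for k in range(n))
```

With a single track the jet probability equals the track probability; a jet whose tracks are all strongly displaced yields a value close to 0, which is the region selected by the operating points quoted above.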

Soft-lepton taggers [7] take a different approach to b tagging. Rather than focusing on tracks
within a jet, they identify semileptonic decays by looking for a lepton matched to a jet. The
branching ratio of approximately 10% per lepton makes this method useful; however, if used
alone, this class of tagger is not competitive with the previously mentioned taggers. Because a
soft-lepton tagger does not rely on the presence of displaced tracks or vertices, it has a chance to
identify b jets that the other methods cannot. In practice, at CDF only the soft muon tagger is used,
since high-purity electron or tau identification within jets is very difficult.

Neural networks (NNs) can use as many flavor-discriminating observables as is computationally
feasible; hence the efficiency of NN taggers is often equal to or greater than that of conventional
taggers for a given purity. One such NN-based algorithm at CDF, called the "KIT flavor
separator" [8], analyzes SecVtx-tagged jets and identifies secondary vertices that are likely from
long-lived B hadrons, separating them from jets with secondary vertices that originate from charm
hadrons or that are falsely reconstructed. This flavor separator has been used in many CDF analyses,
notably in the CDF observation of single top quark production [9]. Another NN-based
algorithm, the "Roma tagger" [10, 11], has been used at CDF in light Higgs searches. While the
SecVtx tagger attempts to find exactly one displaced vertex in a jet, the Roma tagger uses a vertexing
algorithm that can find multiple vertices, as may be the case when multiple hadrons decay
within the same jet cone (for example, in a $B \to D$ decay). Three types of NNs are used: one to
distinguish heavy from light vertices, another to distinguish heavy-candidate from light-candidate
unvertexed tracks, and a third that takes as inputs the first two NN outputs along with other
flavor-discriminating information, including SecVtx and jet probability tag statuses, number of identified
muons, and vertex displacement and mass information. The performance of the Roma tagger is
roughly equivalent to SecVtx at its operating points but allows for an "ultra loose" operating point
yielding greater efficiency, useful in certain analyses.

In this paper we describe a new tagger that builds on the development of these taggers using
feed-forward NN architectures. The NNs provide the ability to exploit correlations in many
variables. The tagger is unique in its emphasis on individual tracks, and in its ability to evaluate
jets with only a single track. Each track's potential for having come from a B-hadron decay is
evaluated by a NN, and the outputs of this NN are fed into a jet-wide NN along with other jet
observables such as the significance of the displacement of the secondary vertex. The output of
this NN, which we call the jet bness, is designed to identify jets containing a B-hadron decay. The
continuity of the NN output value allows for a tunable operating point corresponding to the desired
purity and efficiency.

To characterize the tagger's performance, the efficiency and mistag rate are obtained as a function
of the jet bness cut in Z + 1 jet (rich in light-flavor jets) and $t\bar{t}$ (rich in b jets) candidate samples.
This choice of data samples differs from many previous evaluations of performance using generic
di-jet samples. The large data sample accumulated at the Tevatron allows us to use the purer
top quark samples for b-tagging efficiency studies. The ultimate use of this tagger is aimed at
searches for standard model dibosons and Higgs bosons. The momentum spectrum of b quarks in
top pair production is better matched to these searches than the relatively soft quark momentum
spectrum found in generic di-jet samples. Finally, since our tagger will incorporate information
from many different tagging methods, techniques that, for instance, use soft-lepton-tagged jets as
an input to an efficiency measurement for displaced-vertex taggers cannot be used.

2. The CDF Detector 

The CDF II detector is described in detail elsewhere [12]. The detector is cylindrically symmetric
around the proton beam line,† with tracking systems that sit within a superconducting solenoid
which produces a 1.4 T magnetic field aligned coaxially with the $p\bar{p}$ beams. The Central Outer
Tracker (COT) is a 3.1 m long open cell drift chamber which performs 96 track measurements in
the region between 0.40 and 1.37 m from the beam axis, providing coverage in the pseudorapidity
region $|\eta| < 1.0$ [13]. Sense wires are arranged in eight alternating axial and ±2° stereo
"superlayers" with 12 wires each. The position resolution of a single drift time measurement is about
140 $\mu$m.

† The proton beam direction is defined as the positive $z$ direction. The polar angle, $\theta$, is measured from the origin
of the coordinate system at the center of the detector with respect to the $z$ axis, and $\phi$ is the azimuthal angle.
Pseudorapidity, transverse energy, and transverse momentum are defined as $\eta = -\ln\tan(\theta/2)$, $E_T = E\sin\theta$, and
$p_T = p\sin\theta$, respectively. The rectangular coordinates $x$ and $y$ point radially outward and vertically upward
from the Tevatron ring, respectively.

Charged-particle trajectories are found first as a series of approximate line segments in the
individual axial superlayers. Two complementary algorithms associate segments lying on a common
circle, and the results are merged to form a final set of axial tracks. Track segments in stereo
superlayers are associated with the axial track segments to reconstruct tracks in three dimensions.

The efficiency for finding isolated high-momentum tracks is measured using electrons from
$W \to e\nu$ decays identified in the central region $|\eta| < 1.1$ using only calorimetric information
from the electron shower and the missing transverse energy. In these events, the efficiency for
finding the electron track is $99.93^{+0.07}_{-0.35}$%, and this is typical for isolated high-momentum tracks
from either electronic or muonic W decays contained in the COT. The transverse momentum
resolution of high-momentum tracks is $\delta p_T/p_T^2 \approx 0.1\%$ (GeV/c)$^{-1}$. Their track position resolution
in the direction along the beam line at the origin is $\delta z \approx 0.5$ cm, and the resolution on the track
impact parameter, the distance from the beam line to the track's closest approach in the transverse
plane, is $\delta d_0 \approx 350$ $\mu$m.

A five-layer double-sided silicon microstrip detector (SVX) covers the region between 2.5 and
11 cm from the beam axis. Three separate SVX barrel modules along the beam line cover a length
of 96 cm, approximately 90% of the luminous beam interaction region. Three of the five layers
combine an $r$-$\phi$ measurement on one side and a 90° stereo measurement on the other, and the
remaining two layers combine an $r$-$\phi$ measurement with small-angle stereo at ±1.2°. The typical
silicon hit resolution is 11 $\mu$m. Additional Intermediate Silicon Layers (ISL) at radii between 19
and 30 cm from the beam line in the central region link tracks in the COT to hits in the SVX.

Silicon hit information is added to COT tracks using a progressive "outside-in" tracking algorithm
in which COT tracks are extrapolated into the silicon detector, associated silicon hits are
found, and the track is refit with the added information of the silicon measurements. The initial
track parameters provide a width for a search road in a given layer. Then, for each candidate hit
in that layer, the track is refit and used to define the search road into the next layer. This stepwise
addition of precision SVX information at each layer progressively reduces the size of the search
road, while also accounting for the additional uncertainty due to multiple scattering in each layer.
The search uses the two best candidate hits in each layer to generate a small tree of final track
candidates, from which the tracks with the best $\chi^2$ are selected. The efficiency for associating at
least three silicon hits with an isolated COT track is 91 ± 1%. The extrapolated impact parameter
resolution for high-momentum outside-in tracks is much smaller than for COT-only tracks: 30 $\mu$m,
including the uncertainty in the beam position.

Outside the tracking systems and the solenoid, segmented calorimeters with projective geometry
are used to reconstruct electromagnetic (EM) showers and jets. The EM and hadronic
calorimeters are lead-scintillator and iron-scintillator sampling devices, respectively. The central
and plug calorimeters are segmented into towers, each covering a small range of pseudorapidity
and azimuth, and in full cover the entire $2\pi$ in azimuth and the pseudorapidity regions of $|\eta| < 1.1$
and $1.1 < |\eta| < 3.6$, respectively. The transverse energy $E_T$, where the polar angle is calculated using
the measured $z$ position of the event vertex, is measured in each calorimeter tower. Proportional
and scintillating strip detectors measure the transverse profile of EM showers at a depth corresponding
to the shower maximum.

High-momentum jets, photons, and electrons leave isolated energy deposits in contiguous
groups of calorimeter towers which can be summed together into an energy cluster. Electrons
are identified in the central EM calorimeter as isolated, mostly electromagnetic clusters that also
match with a track in the pseudorapidity range $|\eta| < 1.1$. The electron transverse energy is reconstructed
from the electromagnetic cluster with precision $\sigma(E_T)/E_T = 13.5\%/\sqrt{E_T(\mathrm{GeV})} \oplus 2\%$,
where the $\oplus$ symbol denotes addition in quadrature. Jets are identified as a group of electromagnetic
and hadronic calorimeter clusters using the JETCLU algorithm [14] with a cone size of 0.4. Jet
energies are corrected for the calorimeter non-linearity, losses in the gaps between towers, multiple
primary interactions, the underlying event, and out-of-cone losses [15]. The jet energy resolution
is approximately $\sigma_{E_T} = 1.0$ GeV $+ 0.1 \times E_T$.
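
As a worked illustration of the two resolution formulas quoted above, the following minimal sketch evaluates them numerically. The function names are hypothetical; the constants are taken directly from the text.

```python
import math


def electron_et_resolution(et_gev):
    """Relative electron E_T resolution: 13.5%/sqrt(E_T) added in quadrature with 2%."""
    stochastic = 0.135 / math.sqrt(et_gev)
    constant = 0.02
    # math.hypot(a, b) = sqrt(a^2 + b^2), i.e. addition in quadrature.
    return math.hypot(stochastic, constant)


def jet_et_resolution(et_gev):
    """Approximate absolute jet E_T resolution in GeV: 1.0 GeV + 0.1 * E_T."""
    return 1.0 + 0.1 * et_gev
```

For a 40 GeV electron the stochastic term (about 2.1%) and the constant term (2%) contribute nearly equally, giving a relative resolution of about 2.9%, while a 50 GeV jet has an absolute resolution of about 6 GeV.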

Directly outside of the calorimeter, four-layer stacks of planar drift chambers detect muons
with $p_T > 1.4$ GeV/c that traverse the five absorption lengths of the calorimeter. Farther out, behind
an additional 60 cm of steel, four layers of drift chambers detect muons with $p_T > 2.0$ GeV/c.
The two systems both cover a region of $|\eta| < 0.6$, though they have different structure and their
geometrical coverages do not overlap exactly. Muons in the region $0.6 < |\eta| < 1.0$
pass through at least four drift layers lying in a conic section outside of the central calorimeter.
Muons are identified as isolated tracks in the COT that extrapolate to track segments in one of the
four-layer stacks.

3. Description of the neural network 

All neural networks are trained using simulated data samples. The geometric and kinematic
acceptances are obtained using a GEANT-based simulation of the CDF II detector [16]. For the comparison
to data, all sample cross sections are normalized to the results of NLO calculations performed
with the MCFM v5.4 program [17] and using the CTEQ6M parton distribution functions [18].

3.1. Basic track selection 

A great deal of information as to whether a jet contains a B-hadron decay is contained within
the jet's individual tracks. Indeed, as described earlier, the jet probability algorithm [6] uses
information solely based on the significance of the impact parameters of tracks. Furthermore, an
important choice to make when seeking displaced vertices is which tracks to use as candidates for
a fit. In light of this, our tagger takes a ground-up approach where the first step in the evaluation
of how b-like a jet is involves using a neural network to discriminate B-hadron decay tracks
from other tracks in a jet. We use relatively loose criteria when selecting which tracks to evaluate
with our track-by-track NN, thereby improving the b-tagging efficiency. We reject tracks that use
hits only in the COT, as the COT alone has insufficient resolution to distinguish the effects of the
displacement of a B-hadron decay from the primary vertex. Additionally, a track must have
$p_T > 0.4$ GeV/c, a requirement CDF maintains for all tracks, and be found within a cone of $\Delta R < 0.4$
about the jet axis, where $\Delta R = \sqrt{(\Delta\phi)^2 + (\Delta\eta)^2}$. Finally, for tracks within a jet, track pairs
are removed if they are oppositely charged, form an invariant mass within 10 MeV of that of a
$K_S$ (0.497 GeV/$c^2$) or $\Lambda$ (1.115 GeV/$c^2$), and can be fit into a two-track vertex. This requirement
is included to reject non-b jets that contain these long-lived particles, as they can mimic b jets,
compromising our purity.
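
The basic track requirements above can be sketched as follows. This is a minimal illustration rather than the CDF implementation: the `Track` fields and function names are hypothetical, and the mass-window helper covers only one of the three conditions of the $K_S$/$\Lambda$ veto (opposite charge and a successful two-track vertex fit would have to be checked separately).

```python
import math
from dataclasses import dataclass


@dataclass
class Track:
    pt: float             # transverse momentum in GeV/c
    eta: float            # pseudorapidity
    phi: float            # azimuthal angle in radians
    n_silicon_hits: int   # hypothetical field: number of attached silicon hits


def delta_r(eta1, phi1, eta2, phi2):
    """Delta R = sqrt(dphi^2 + deta^2), with dphi wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def passes_basic_selection(track, jet_eta, jet_phi):
    """Reject COT-only tracks; require pT > 0.4 GeV/c and Delta R < 0.4 to the jet axis."""
    return (
        track.n_silicon_hits > 0
        and track.pt > 0.4
        and delta_r(track.eta, track.phi, jet_eta, jet_phi) < 0.4
    )


# Masses in GeV/c^2 as quoted in the text.
V0_MASSES = {"K_S": 0.497, "Lambda": 1.115}


def in_v0_mass_window(pair_mass, window=0.010):
    """True if a track-pair mass lies within 10 MeV of the K_S or Lambda mass."""
    return any(abs(pair_mass - m) < window for m in V0_MASSES.values())
```

Wrapping the azimuthal difference matters for jets near $\phi = \pm\pi$, where a naive subtraction would place nearby tracks almost $2\pi$ apart.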

3.2. The track neural network

Figure 1: Inputs used in the neural network for calculating the per-track bness. The red dashed line is signal and the
black solid line is background. The y-axis is in arbitrary units.



The two primary categories of input variables to the track-by-track NN are observables related
to the displacement of the track from the primary vertex and observables related to the kinematics
of the track. The former category includes the track's signed impact parameter† ($d_0$), its $z$ displacement
($z_0$) from the primary vertex, and the significances of these two quantities, given their
uncertainties ($d_0/\sigma_{d_0}$ and $z_0/\sigma_{z_0}$). The latter category takes advantage of the fact that tracks from
B-hadron decays have a somewhat harder $p_T$ spectrum than other tracks, and are more collimated
within a jet. This category includes the track's $p_T$, its pseudorapidity ($\eta_{axis}$) with respect to the jet
axis, and its momentum ($p_{perp}$) perpendicular to the jet axis.

† We define the signed impact parameter of a track as positive if the angle between the candidate b-jet direction
and the line joining the primary vertex to the point of closest approach of the track to the vertex is less than 90°, and
as negative otherwise.

A final input variable to the track-by-track bness NN is the $E_T$ of the jet, since distributions
of the track observables are correlated with their parent jet $E_T$. To ensure that the distributions of
track observables used to train the track-by-track NN are not kinematically biased, B-hadron and
non-B-hadron tracks are weighted in training to have the same parent jet $E_T$ distribution.
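
One simple way to give two samples the same parent-jet $E_T$ distribution is bin-by-bin reweighting, sketched below. The text does not specify how the weighting was done, so the binning, the choice to reweight background to signal, and the function names are all assumptions made for illustration.

```python
from collections import Counter


def et_bin(et_gev, width=10.0):
    """Index of the fixed-width E_T bin containing this value (assumed binning)."""
    return int(et_gev // width)


def background_weights(signal_et, background_et, width=10.0):
    """Per-track weights that make the binned background E_T spectrum match signal.

    Each background track is weighted by (signal count) / (background count)
    in its parent-jet E_T bin.
    """
    sig = Counter(et_bin(e, width) for e in signal_et)
    bkg = Counter(et_bin(e, width) for e in background_et)
    return [sig[et_bin(e, width)] / bkg[et_bin(e, width)] for e in background_et]
```

After this weighting, the sum of background weights in each $E_T$ bin equals the number of signal tracks in that bin, so the network cannot separate the classes using the $E_T$ spectrum alone.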

Figure 1 shows distributions of the track variables in PYTHIA [19] $ZZ \to jjjj$ Monte Carlo
simulations (MC) for tracks matched by $\Delta R < 0.141$ to particles that come from B-hadron decays
compared to tracks in jets which are not matched to B hadrons. These figures indicate that the
displacement variables tend to give more discrimination power than the kinematic variables; in
particular, the impact parameter variables are the most important inputs to the NN.

The NN is a feed-forward multilayer perceptron with a single output and two hidden layers of 
15 and 14 nodes implemented using the MLP algorithm from the TMVA package [20]. The same 
number of signal and background events was used in the training. The performance of the NN was 
similar with larger numbers of hidden layer nodes. 
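
The network topology described above (eight track-level inputs, hidden layers of 15 and 14 nodes, one output) can be sketched as a plain forward pass. This is not the TMVA implementation: the tanh activation, the random weight initialisation, and the function names are assumptions chosen only to make the layer structure concrete.

```python
import math
import random

# Inputs: d0, z0, d0 significance, z0 significance, pT, eta_axis, p_perp, jet E_T.
LAYER_SIZES = [8, 15, 14, 1]


def init_weights(sizes, seed=0):
    """Random weights; each node stores one weight per input plus a trailing bias."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]
        for n_in, n_out in zip(sizes, sizes[1:])
    ]


def forward(weights, inputs):
    """Feed-forward evaluation with tanh nodes (assumed activation)."""
    activations = list(inputs)
    for layer in weights:
        activations = [
            # zip stops at the last input, so row[-1] is used only as the bias.
            math.tanh(row[-1] + sum(w * a for w, a in zip(row, activations)))
            for row in layer
        ]
    return activations[0]
```

With a tanh output node the result lies between -1 and 1, matching the range over which the bness distributions are shown in this paper.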

3.3. The jet neural network 

To determine how b-like a jet is, we train a NN to distinguish jets containing B-hadron decays
from those not containing B-hadron decays. Many of the input variables come directly from the
track-by-track NN described in the previous section: the NN values of the five most b-like tracks
($b_i$, $i = 0..4$), as well as the number ($n_{trk}$) of tracks with a NN output greater than 0.
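
Building these track-derived inputs amounts to a sort and a count, as in the minimal sketch below. The padding value used when a jet has fewer than five tracks is an assumption (the text does not say how such jets are handled), as is the function name.

```python
def track_inputs(track_bness, n_top=5, pad=-1.0):
    """Return (b_0..b_4, n_trk) for one jet.

    b_i are the track NN outputs ordered from most to least b-like; missing
    entries are filled with `pad` (assumed convention). n_trk counts tracks
    with NN output greater than 0.
    """
    ordered = sorted(track_bness, reverse=True)
    top = ordered[:n_top] + [pad] * max(0, n_top - len(ordered))
    n_trk = sum(1 for b in track_bness if b > 0)
    return top, n_trk
```

A jet with a single track still produces a complete input vector, which is what allows the jet-level network to evaluate it.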

We use tracks with track-by-track NN values greater than -0.5 in the fitting of a secondary
vertex. An initial fit is performed with all such tracks; if the largest contribution to the total fit $\chi^2$
from any of them exceeds a value of 50, it is removed, and the remaining tracks are re-fit. This
process continues until either the largest contribution from any track is less than 50, or there are
fewer than two tracks to be fit. If a secondary vertex is successfully fit, then the significance of its
displacement from the primary vertex ($L_{xy}/\sigma_{L_{xy}}$) and the invariant mass ($m_{vtx}$) of the tracks used to
fit it both serve as inputs into the NN.
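
The iterative pruning described above can be sketched independently of the actual vertex fitter. In this minimal illustration the fitter is passed in as a callable; `toy_fit`, a one-dimensional weighted mean, is purely a stand-in so that the loop can be exercised, and all names are hypothetical.

```python
def prune_and_fit(tracks, fit, max_chi2=50.0):
    """Refit while dropping the worst track until all chi2 contributions are acceptable.

    `fit(tracks)` must return (vertex, per_track_chi2_contributions).
    Returns (vertex, surviving_tracks), or (None, surviving_tracks) if fewer
    than two tracks remain.
    """
    tracks = list(tracks)
    while len(tracks) >= 2:
        vertex, contributions = fit(tracks)
        worst = max(range(len(tracks)), key=contributions.__getitem__)
        if contributions[worst] <= max_chi2:
            return vertex, tracks
        del tracks[worst]  # remove the largest contributor and refit
    return None, tracks


def toy_fit(tracks):
    """Stand-in fitter: tracks are (position, sigma); vertex is the weighted mean."""
    weights = [1.0 / sigma ** 2 for _, sigma in tracks]
    vertex = sum(w * x for w, (x, _) in zip(weights, tracks)) / sum(weights)
    return vertex, [((x - vertex) / sigma) ** 2 for x, sigma in tracks]
```

For example, `prune_and_fit([(1.0, 0.1), (1.1, 0.1), (0.9, 0.1), (5.0, 0.1)], toy_fit)` discards the outlier at 5.0 on the first iteration and returns a vertex near 1.0 built from the remaining three tracks.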

Additionally, because a much higher fraction of b jets than non-b jets contain $K_S$ particles,
the number of $K_S$ candidates found is used as an input to the jet-by-jet NN. Finally, if there is a
muon candidate in the jet cone, its likelihood to be a true muon is used as an input. This value is
calculated using the soft muon tagger [7] described above. The architecture of the jet-by-jet NN is
similar to that of the track-by-track NN, with two hidden layers of 15 and 16 nodes. As in the track
NN, to avoid a kinematic bias, the parent jet $E_T$ distributions are weighted to be equal and also
input into the NN. Distributions of the most important jet-by-jet NN input variables are shown in
Figure 2. Distributions of the NN output are shown in Figure 3.

The training for the track NN as well as the jet NN is performed using jets, from a PYTHIA ZZ
MC sample, matched to b quarks from $Z \to b\bar{b}$ events for signal and jets not matched to b quarks
for background.

4. Selection for Mistag Rate and Efficiency Determination 

In order to use this new b tagger in analyses, we determine the efficiency and false tag ("mistag")
rate as a function of a minimal bness requirement, $e(b)$ and $m(b)$ respectively. We use comparisons
between data and Monte Carlo simulation to evaluate these quantities and their uncertainties. Also,
we evaluate the efficiency and mistag rate in Monte Carlo ($e_{MC}(b)$ and $m_{MC}(b)$, respectively), and

Figure 2: The most relevant inputs used in the neural network for calculating the per-jet bness. The red dashed line is
signal and the black solid line is background. $b_i$ refers to the bness of the $i$th track, ordered in bness. The y-axis is in
arbitrary units.

[Plot: horizontal axis: jet bness]

Figure 3: Output of the final neural network, for signal (red dashed line) and background (black solid line). Good
separation is seen with the exception of signal (and background) peaking near a jet bness of -0.8. This region is
dominated by jets with zero tracks having positive track bness, zero $K_S$ candidates found, and no secondary vertex.
Indeed, some b jets are indistinguishable from non-b jets. The sharp features of this distribution are a result of the
discrete inputs to the NN.

Z + 1 jet selection
    $N_{leptons}$ = 2, both electrons or both muons
    Leptons have opposite charge
    $\Delta z_0$ between leptons < 5 cm
    Lepton $p_T$ > 20 GeV/c
    75 GeV/$c^2$ < $M_{ll}$ < 105 GeV/$c^2$
    $\not\!\!E_T$ < 25 GeV
    Reconstructed $p_T(Z)$ > 10 GeV/c
    $N_{jets}$($E_T$ > 10 GeV) = 1
    Jet $E_T$ > 20 GeV, $|\eta|$ < 2.0

$t\bar{t}$ selection
    $N_{leptons}$ = 1
    Lepton $p_T$ > 20 GeV/c
    $\not\!\!E_T$ > 20 GeV
    $\not\!\!E_T$-significance > 1 (3) for $\mu$ ($e$) events
    Reconstructed $M_T(W)$ > 28 GeV/$c^2$
    Highest two bness jets' $E_T$ > 20 GeV
    $N_{jets}$($E_T$ > 15 GeV) ≥ 4
    Total sum $E_T$ > 300 GeV

Table 1: Summary of event selection requirements for the Z + 1 jet and $t\bar{t}$ samples. The total sum $E_T$ is defined as the
sum of the lepton $p_T$, $\not\!\!E_T$, and $E_T$ of all jets with $E_T$ > 15 GeV.





                          Electrons       Muons
Z + 1 jet selection
    Data events           9512            5575
    MC events             9640 ± 880      5540 ± 490
$t\bar{t}$ selection
    Data events           507             835
    MC events             542 ± 56        862 ± 85

Table 2: Number of events in data and MC in the Z + 1 jet and $t\bar{t}$ selection regions, after proper scale factors have been
applied. The uncertainties on the MC reflect only the two dominant systematic uncertainties: the uncertainty on the
jet energy scale and the uncertainty on the luminosity. Overall, the agreement in number of events is good.



determine the necessary scale factor, $s_e(b) = e(b)/e_{MC}(b)$ (with a similar definition for the mistag
rate), to correct the simulation.
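
The scale factor definition can be made concrete with a minimal sketch. It assumes, for illustration only, that the rate at a given threshold is simply the fraction of jets in a sample with bness above that threshold; the measurement described in this paper additionally corrects for sample composition, which is not modelled here. Function names are hypothetical.

```python
def fraction_above(bness_values, threshold):
    """Fraction of jets whose bness exceeds the threshold."""
    return sum(1 for b in bness_values if b > threshold) / len(bness_values)


def scale_factor(data_bness, mc_bness, threshold):
    """Ratio of the rate in data to the rate in simulation, s(b) = rate / rate_MC."""
    return fraction_above(data_bness, threshold) / fraction_above(mc_bness, threshold)
```

Scanning `threshold` over the bness range gives the scale factor as a function of the cut; a value below 1 means the simulation over-predicts the rate and must be corrected downward.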

Following the procedure described in Appendix A and Appendix B, we must choose two
independent regions in which to determine the mistag rate and efficiency of the b tagger. To
reduce uncertainties, it is best to choose a well-modelled region dominated by falsely tagged jets
(where we expect few b jets) and a well-modelled region rich in b jets. For the former, we choose
events containing two oppositely charged electrons or muons likely from the decay of a Z boson,
plus one jet. For the latter, we choose events containing the decay of a pair of top quarks, where
we require exactly one lepton, at least four jets, and a large imbalance in transverse momentum in
the event, indicating the likely presence of a neutrino. We expect that the two jets with the highest
bness values in this sample will very likely be b jets. The cuts applied for these two selection
regions are described in Table 1. We use the $\not\!\!E_T$ significance, as defined in [21, 22], to reduce any
contribution from multi-jet production where a jet is mis-identified as an electron or muon.†
These events are selected by high-$p_T$ electron and muon triggers. We use data corresponding
to an integrated luminosity of 4.8 fb$^{-1}$. We use ALPGEN [23], interfaced with PYTHIA for parton
showering, to model W and Z plus jets samples and PYTHIA to model $t\bar{t}$ and other processes with
small contributions. We check the trigger efficiency against a sample of $Z \to e^+e^-$ or $\mu^+\mu^-$ events
without jets. Table 2 contains a summary of the total number of events.

† We define the missing transverse momentum $\vec{\not\!\!E}_T \equiv -\sum_i E_T^i \hat{n}_i$, where $\hat{n}_i$ is the unit vector in the azimuthal plane
that points from the beamline to the $i$th calorimeter tower. We call the magnitude of this vector $\not\!\!E_T$. The significance

5. Mistag Rate Determination 

Figure 4 shows the jet bness distribution for jets in the Z + 1 jet sample. The sample is
dominated by light-flavor jets, but there is a significant contribution of real b jets at higher bness
values, coming from $Z + b\bar{b}$ production. This is seen more clearly in Figure 5, where we separate
the MC jets based on whether there are generator-level b quarks located within each jet's cone
($\Delta R = 0.4$). Also shown is the b-jet purity ($N_{b\text{-jets}}/N_{jets}$) as a function of lower threshold on jet
bness. We see the b-jet incidence rate reaches above 60% for the highest bness cuts, and thus we
expect the uncertainties in the mistag rate to be substantially higher there, due to both the
small sample of available jets and the high contamination rate combined with the uncertainty on
the number of b jets in that smaller sample.
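
The purity curve described above is a ratio of counts above a moving threshold, which the following minimal sketch computes from simulated jets. The input format (pairs of bness and a generator-level match flag) and the function name are assumptions for illustration.

```python
def b_jet_purity(jets, threshold):
    """N_b-jets / N_jets among jets with bness above the threshold.

    `jets` is an iterable of (bness, is_b_matched) pairs from simulation.
    Returns None when no jet passes the threshold.
    """
    selected = [is_b for bness, is_b in jets if bness > threshold]
    if not selected:
        return None
    return sum(selected) / len(selected)
```

Because few jets survive the tightest cuts, the purity there is computed from small counts, which is one reason the uncertainties grow at high bness.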

The mistag rate for jets above a given bness threshold is simply the fraction of non-b jets above
that threshold. To obtain this quantity, we use the fraction of jets in data above that threshold
($m_{raw}(b)$), but must correct this quantity for the expected number of b jets in our Z + 1 jet sample.
We obtain an estimate of this b-jet contamination from MC simulation, and obtain the corrected
mistag rate, $m(b)$. We show the values of $m(b)$ as well as the relative difference between the mistag
rate in data and MC ($s_m(b) - 1$) in Figure 6.
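
The exact correction is given in the appendices and is not reproduced here. As a sketch of the underlying arithmetic, assume the sample contains a fraction $f_b$ of b jets that pass the threshold with efficiency $e(b)$; then the raw rate is $m_{raw} = f_b\, e + (1 - f_b)\, m$, which can be inverted for the mistag rate $m$. The function and parameter names below are hypothetical.

```python
def corrected_mistag_rate(m_raw, b_fraction, b_efficiency):
    """Invert m_raw = f_b * e + (1 - f_b) * m for the non-b mistag rate m.

    m_raw:        fraction of all jets in the sample above the bness threshold
    b_fraction:   assumed fraction of b jets in the sample before the cut
    b_efficiency: assumed b-jet efficiency at this threshold
    """
    return (m_raw - b_fraction * b_efficiency) / (1.0 - b_fraction)
```

In this simplified picture, the sensitivity to the b-jet content can be probed by re-evaluating the function with `b_fraction` shifted up and down by its relative uncertainty; the shift in the result grows at tight thresholds, where `b_fraction * b_efficiency` becomes comparable to `m_raw`.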

We can also calculate the uncertainty on the mistag rate given the error on the b-tagging efficiency
and the uncertainty on the fraction of b jets in our Z + 1 jet sample. The former is determined
through iterative calculations incorporating the $t\bar{t}$ selection, while the latter we take to be 20% [24].
The resulting uncertainties are also shown in Figure 6.

6. Tagging Efficiency Determination 

We use our $t\bar{t}$ selection, described in Section 4 and Table 1, to calculate the efficiency from a
sample of jets with high b purity. As these events have many jets, we order the jets by decreasing
bness value. This mirrors the procedure in a related analysis using this b tagger [25] and provides
values for the b-tagging efficiency while accounting for this sorting procedure. Figure 7 shows
the jet bness distributions in data and MC for the two jets with highest bness in each event. The
agreement here is very good, and regions of high bness are almost exclusively populated by $t\bar{t}$
events, indicating that our b tagger is properly identifying b jets. We check that the purity of b
jets as a function of the cut on the jet bness in these distributions is also high by splitting jets into
matched and non-matched categories (Figure 8), as done for the Z + 1 jet selection described in
Section 5. We see that the b-jet purity of the $t\bar{t}$ sample is rather high, even for low bness thresholds.



is a measure of the ratio of the value of $\not\!\!E_T$ to its uncertainty, and tends to be small for $\not\!\!E_T$ due to mismeasurement
rather than due to undetected, long-lived neutral particles such as neutrinos.

[Plot: CDF Run II, L = 4.8 fb$^{-1}$; horizontal axis: jet bness]

Figure 4: A comparison of the jet bness in data and MC in the Z + 1 jet selection region. The MC is able to reproduce
the main features of the bness distribution in data. We use this distribution to determine the mistag rate for placing a
cut on jet bness in data, and use the differences between data and MC to determine corrections to the mistag rate in
MC.



CDF Run II, L = 4.Sib'^ 





0.5 1 

bness Cut 



Figure 5: Left: A comparison of the jet bness in data (black points) and MC (green solid line) in the Z + 1 jet selection 
region, with the portion of the MC jets matched to b quarks (purple dashed line) shown independently. Right: The 
b-jet purity for a given bness cut, as determined from matched jets in the MC. As we wish to use the Z + 1 jet sample 
as a model for mistags, it is necessary to subtract the significant b-jet contribution at high bness values. 

Figure 6: Left: The mistag rate in data (solid black line; dashed lines represent the uncertainty) and Monte Carlo
(dot-dashed green line) as a function of the jet bness. We see that our simulation typically under-predicts the mistag
rate measured in data, requiring us to consider a correction to apply to the MC. Right: The calculated MC scale factor
on the mistag rate (solid line) and its uncertainty (dashed lines) relative to the mistag rate in the Monte Carlo. The
values of the scale factors and their uncertainties at the relevant bness cuts in this analysis are summarized in
Table 3. We see very large uncertainties on the mistag rate scale factor around the high jet bness cut of 0.85, due to
the small number of events and the significant heavy-flavor removal that must be done in this region.

Figure 7: Jet bness of the first (left) and second (right) jet, as ordered by bness, in the tt lepton + jets selection region.
The simulation reproduces most of the features of the data, and we see much of the b-enriched samples clustered
towards high bness.






[Figure 8 plots. Panel label: CDF Run II, L = 4.8 fb^-1.]




Figure 8: Top Left: A comparison of the highest jet bness in data (black points) and MC (green solid line) in the tt
lepton + jets sample, with the portion of the MC jets matched to b quarks (purple dashed line) shown independently.
Top Right: The b-jet purity for a given bness cut on the highest jet bness, as determined from matched jets in the
MC. Bottom Left: A comparison of the second highest jet bness in data (black points) and MC (green solid line) in
the tt lepton + jets selection region, with the portion of the MC jets matched to b quarks (purple dashed line) shown
independently. Bottom Right: The b-jet purity for a given bness cut on the second highest jet bness, as determined
from matched jets in the MC. In these plots, we see a high purity in our chosen sample, which is approximately 55%
tt events.

Figure 9: The efficiency of a bness cut in data (solid black line; dashed lines represent the uncertainty) and Monte Carlo
(dot-dashed green line) as a function of a cut on jet bness for the highest (left) and 2nd highest (right) bness jets in an
event. We see that our simulation typically over-predicts the efficiency measured in data, and thus needs to be corrected.



We calculate the efficiency of a given bness threshold and its uncertainty in an analogous way
to the calculation of the mistag rate, described in detail in Appendix B. We show the calculated
efficiencies and uncertainties for the highest and 2nd highest bness jets in Figure 9, and we show the
relative difference between the efficiency in data and MC (the quantity $s_e(b) - 1$) and its uncertainty
in Figure 10. The relative differences and uncertainties on the efficiency are on the order of 10%
or less, comparable to the SecVtx b tagger scale factors and their uncertainties. Table 3 lists the
efficiency and mistag rates in data and MC for a chosen operating point (the highest jet bness >
0.85, and the 2nd highest jet bness > 0.0), along with the relative difference between data and MC,
and the error on that difference. This choice of operating points is motivated by the optimization
of a cross section measurement that uses the tagger [25]. Figure 11 shows the relationship between
the calculated efficiency of identifying b jets with a cut on the jet bness and the rejection power of
that cut for non-b jets for the highest and 2nd highest bness jets in an event.



Quantity         bness Cut   Data      MC        % Difference   % Error
Mistag Rate      0.0         0.0819    0.0720    14%            4.1%
                 0.85        0.00997   0.00869   15%            21%
Tag Efficiency   0.0         0.622     0.684     -9.0%          8.7%
                 0.85        0.652     0.687     -5.2%          6.2%

Table 3: Mistag rates and efficiencies on jet bness cuts determined from comparisons of data and MC in our Z + 1 jet
and tt control regions. For the bness cut at 0.85, we consider the highest bness jet, and for the bness cut at 0.0, we
consider the 2nd highest bness jet in our tt sample.
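
As a quick arithmetic check, the "% Difference" column of Table 3 can be recomputed from the Data and MC columns as (data - MC)/MC, which is the scale factor minus one. The short Python sketch below does this; the dictionary layout and the function name `relative_difference` are illustrative choices, not part of the analysis. Because the table entries are rounded, the recomputed values agree with the published column only to within a few tenths of a percent.

```python
# Values copied from Table 3: (quantity, bness cut) -> (data, MC, published % difference)
table3 = {
    ("mistag rate", 0.0): (0.0819, 0.0720, 14.0),
    ("mistag rate", 0.85): (0.00997, 0.00869, 15.0),
    ("tag efficiency", 0.0): (0.622, 0.684, -9.0),
    ("tag efficiency", 0.85): (0.652, 0.687, -5.2),
}


def relative_difference(data, mc):
    """Return (data - MC) / MC, i.e. the data/MC scale factor minus one."""
    return (data - mc) / mc


for (quantity, cut), (data, mc, published) in table3.items():
    recomputed = 100.0 * relative_difference(data, mc)
    print(f"{quantity:15s} bness > {cut:4.2f}: "
          f"recomputed {recomputed:+.1f}%, published {published:+.1f}%")
```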



We estimate the performance of the tagger as a function of the jets' transverse momenta and 
pseudorapidity in simulated data of di-jet events, where the jets are b jets. We select the jet with 
the highest bness score and calculate the efficiency for bness > 0.85. These efficiencies are shown 
in Figure 12. The tagging efficiency ranges from 38% at low transverse momentum to more than

Figure 10: The difference in efficiency between data and Monte Carlo (center solid line) and its uncertainty (dashed
lines) relative to the efficiency in the Monte Carlo as a function of the cut on jet bness for the highest (left) and 2nd
highest (right) bness jets in an event. The values of the scale factors and their uncertainties at the relevant bness cuts in
this analysis are summarized in Table 3.




[Figure 11 plots. Horizontal axis (both panels): b-jet efficiency.]



Figure 11: Plots of the non-b-jet rejection versus the b-jet efficiency for a range of cuts on jet bness for the highest
(left) and 2nd highest (right) bness jets in an event.

Figure 12: Tag performance for the jet with the highest bness score as a function of transverse momentum (left) and η
(right) for a bness requirement b > 0.85, derived from simulated data. The tagging efficiency ranges from 38% at low
transverse momentum to more than 50% at higher momentum. The efficiency is flat in the central region (|η| < 1.0)
and drops outside the acceptance of the central part of the tracking system.



50% at higher momentum. The efficiency is flat in the central region (|η| < 1.0) and drops outside
the acceptance of the central part of the tracking system.

While generic comparisons between taggers are difficult, we compare our tagger to the most
commonly used b tagger in CDF, the SecVtx tagger. The efficiency and mistag rates of our tagger
compare favorably to those of the SecVtx tagger. We compare the two taggers using simulated events,
looking at the two highest bness jets in the MC of our tt selection, and requiring b > 0.85
for our tagger. The "tight" SecVtx tagger operating point on this sample of jets has an efficiency
of 0.59 and a mistag rate of 0.052, while the "loose" operating point has an efficiency of 0.68
with a mistag rate of 0.088. For the highest bness jet cut at > 0.85, we have an efficiency near the
loose-tag efficiency (0.69), but a lower mistag rate (0.009) than the tight SecVtx tag; for the 2nd
highest bness jet cut at > 0.0, we have a similarly high efficiency (0.68) while allowing a mistag
rate similar to the loose SecVtx tag (0.082).

7. Conclusion 

We have described a neural network based b tagger in current use at the Fermilab Tevatron's
CDF experiment. By examining all the tracks associated with jets, this tagger has a larger
acceptance than previous neural network based taggers at CDF. Furthermore, the tagger is calibrated
using data from Z boson decays and events containing top quark pair production, a novel method
which yields small systematic uncertainties on the tagging efficiency and mistag rate. Finally, the
utility of this tagger has been demonstrated in a measurement of the ZZ and WZ production cross
sections [25].

Appendix A. Evaluation of Mistag Rate and Efficiency

For any given selection of data, we can calculate the mistag rate (where all non-b jets are
considered mistags) if we know the number $N_B$ of b jets, the number $N_B(b)$ of b jets above the
bness threshold, the total number $N$ of jets, and the total number $N(b)$ of jets above the bness cut
threshold:

$$ m(b) = \frac{N(b) - N_B(b)}{N - N_B} \qquad \text{(A.1)} $$

We may use MC to determine the fraction $f_B$ of jets that are b jets, and the efficiency $e_{MC}(b)$
for these jets to pass the bness cut. This efficiency may need to be modified by a scale factor
$s_e(b) = e(b)/e_{MC}(b)$ if it is different from the true efficiency evaluated in data. Thus,

$$ N_B = f_B N \quad \text{and} \quad N_B(b) = s_e(b)\, e_{MC}(b)\, f_B N. \qquad \text{(A.2)} $$

Also, if we define a mistag rate that has not been corrected for the possible presence of b jets in
the same sample, $m_{raw}(b) = N(b)/N$, then we may write equation A.1 in the following way:

$$ m(b) = \frac{m_{raw}(b) N - s_e(b)\, e_{MC}(b)\, f_B N}{N - f_B N}
        = \frac{m_{raw}(b) - s_e(b)\, e_{MC}(b)\, f_B}{1 - f_B} \qquad \text{(A.3)} $$

We can write an analogous expression for the efficiency of b jets passing a given bness cut:

$$ e(b) = \frac{e_{raw}(b) - s_m(b)\, m_{MC}(b)\, f_L}{1 - f_L} \qquad \text{(A.4)} $$

where $e_{raw}(b)$ is a "raw" efficiency uncorrected for the presence of non-b jets in a sample, $m_{MC}(b)$
is the mistag rate as measured in MC, corrected to match data by a scale factor $s_m(b)$, and $f_L$ is the
fraction of light-flavor (here defined as non-b) jets in the chosen sample.
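
The equivalence between the count-based form (A.1) and the rate-based form (A.3) can be checked numerically. The following Python sketch uses made-up jet counts chosen only to illustrate the algebra; the function names and all numbers are hypothetical and are not taken from the analysis.

```python
import math


def mistag_from_counts(n_total, n_b, n_pass, n_b_pass):
    """Equation (A.1): fraction of non-b jets that pass the bness cut."""
    return (n_pass - n_b_pass) / (n_total - n_b)


def mistag_from_rates(m_raw, s_e, e_mc, f_b):
    """Equation (A.3): the same quantity written in terms of rates."""
    return (m_raw - s_e * e_mc * f_b) / (1.0 - f_b)


# Hypothetical counts (illustration only).
n_total, n_b = 1000, 50        # all jets, and the b jets among them
n_pass, n_b_pass = 120, 30     # jets above the cut, and the b jets among those

m_raw = n_pass / n_total       # uncorrected mistag rate N(b)/N
f_b = n_b / n_total            # b-jet fraction
e_true = n_b_pass / n_b        # true efficiency, i.e. the product s_e(b) * e_MC(b)

m_a1 = mistag_from_counts(n_total, n_b, n_pass, n_b_pass)
# Here the scale factor is set to 1 and e_MC to the true efficiency,
# so that their product equals e_true.
m_a3 = mistag_from_rates(m_raw, 1.0, e_true, f_b)

print(m_a1, m_a3)
```

Both expressions give 90/950 for these counts, since (A.3) is (A.1) with numerator and denominator divided by N.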

Note that the determination of the mistag rate depends on the calculated value of the efficiency
(through the scale factor term $s_e(b)$), and that in turn the determination of the efficiency depends
on the mistag rate (again through the scale factor $s_m(b)$). Similarly, the uncertainties on these
quantities (see below) depend on each other in a non-linear fashion. Thus, we use an iterative
procedure to solve for the mistag rate, efficiency, and their uncertainties. We calculate the mistag
rate first using a value of $s_e(b) = 1$, and find that the values of $e(b)$ and $m(b)$ (and their
uncertainties) converge very quickly.
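
A minimal Python sketch of such an iteration is given below. It alternates between (A.3) and (A.4), starting from $s_e(b) = 1$, until both values stop changing. The function name `solve_rates`, the stopping tolerance, and every input number are assumptions made for illustration; in particular, the raw rates and flavor fractions are not the measured ones. The uncertainties, which the text says are iterated as well, are left out of this sketch.

```python
def solve_rates(m_raw, f_b, e_raw, f_l, e_mc, m_mc, tol=1e-12, max_iter=50):
    """Iterate (A.3) and (A.4), starting from s_e(b) = 1.

    m_raw, f_b : raw mistag rate and b-jet fraction in the mistag sample
    e_raw, f_l : raw efficiency and non-b fraction in the efficiency sample
    e_mc, m_mc : efficiency and mistag rate as measured in MC
    """
    s_e = 1.0
    m_old, e_old = float("inf"), float("inf")
    for n_iter in range(1, max_iter + 1):
        m = (m_raw - s_e * e_mc * f_b) / (1.0 - f_b)   # (A.3)
        s_m = m / m_mc
        e = (e_raw - s_m * m_mc * f_l) / (1.0 - f_l)   # (A.4)
        s_e = e / e_mc
        if abs(m - m_old) < tol and abs(e - e_old) < tol:
            return {"m": m, "e": e, "s_m": s_m, "s_e": s_e, "iterations": n_iter}
        m_old, e_old = m, e
    raise RuntimeError("mistag rate and efficiency did not converge")


# Hypothetical inputs (illustration only).
result = solve_rates(m_raw=0.02, f_b=0.02, e_raw=0.40, f_l=0.45,
                     e_mc=0.69, m_mc=0.0087)
print(result)
```

For small contamination fractions each pass shrinks the change in m(b) and e(b) by roughly f_B f_L / ((1 - f_B)(1 - f_L)), which is why only a handful of iterations are needed in this example.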

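The uncertainties on these quantities may also be calculated from the expressions above. For
the mistag rate,

$$ \sigma_m^2(b) = \frac{m_{raw}(b)\,(1 - m_{raw}(b))}{N (1 - f_B)^2}
   + \left( \frac{\sigma_e(b)\, f_B}{1 - f_B} \right)^2
   + \left( \frac{\sigma_{f_B} \left[ s_e(b)\, e_{MC}(b) - m(b) \right]}{1 - f_B} \right)^2 $$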

The first term is a binomial uncertainty on the raw mistag rate of the sample, and is the term
related to the statistical uncertainty of the sample used to determine the mistag rate. The second
term comes from the uncertainty on the measured value of $e(b)$, which can be calculated using a
similar expression, and is done so iteratively, as $\sigma_m(b)$ and $\sigma_e(b)$ depend on each other. The final
term is due to the uncertainty on $f_B$, which will depend on the choice of MC and the region in
which MC and data are compared. A similar expression determines $\sigma_e(b)$.
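
The three contributions described above can be sketched in Python by applying standard first-order error propagation to (A.3). This is an illustration under that assumption rather than the analysis code: the function name, argument names, and input numbers are hypothetical, and the uncertainty on the efficiency is taken as a fixed input instead of being iterated.

```python
import math


def mistag_uncertainty(m_raw, n_jets, f_b, eff, sigma_eff, sigma_f_b):
    """Return (m, sigma_m) from first-order error propagation on (A.3).

    eff is the data efficiency e(b) = s_e(b) * e_MC(b).
    """
    m = (m_raw - eff * f_b) / (1.0 - f_b)
    # Binomial (statistical) uncertainty on the raw mistag rate.
    stat = math.sqrt(m_raw * (1.0 - m_raw) / n_jets) / (1.0 - f_b)
    # Contribution from the uncertainty on the efficiency.
    from_eff = sigma_eff * f_b / (1.0 - f_b)
    # Contribution from the uncertainty on the b-jet fraction.
    from_f_b = sigma_f_b * (eff - m) / (1.0 - f_b)
    return m, math.sqrt(stat**2 + from_eff**2 + from_f_b**2)


# Hypothetical inputs (illustration only).
m, sigma_m = mistag_uncertainty(m_raw=0.02, n_jets=10000, f_b=0.02,
                                eff=0.69, sigma_eff=0.04, sigma_f_b=0.004)
print(m, sigma_m)
```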

Appendix B. Tagging Efficiency Determination 

Similar to our calculation of the mistag rate, we calculate the efficiency observed in data using
equation A.4. Both $e_{raw}(b)$ and $s_m(b)\, m_{MC}(b) = m(b)$ can be calculated easily by counting events
above a given bness threshold in the data and MC respectively. Because of the different competing
processes in our tt sample (there is a significant contribution from W + light flavor jets and W +
bb processes), it is best to break $f_L$ into these most significant subsamples:

$$ f_L = \frac{\sum_X f_L^X N_X}{N_{Wjj} + N_{Wbb} + N_{tt}} \qquad \text{(B.1)} $$

where $N_X$ is the number of events predicted by MC in subsample X, and $f_L^X$ is the fraction of
non-b jets in subsample X. We assume that the MC correctly reproduces the values of $f_L^X$. To
determine $s_e(b) = e(b)/e_{MC}(b)$, we write down a similar expression for the efficiency in MC using
the efficiency of each subsample in MC:

$$ e_{MC}(b) = \frac{1}{N_B} \sum_X e_X(b)\, f_B^X N_X \qquad \text{(B.2)} $$

where, as before, $N_X$ is the number of events predicted by Monte Carlo in subsample X, $f_B^X$ is the
total fraction of b jets in subsample X, and $e_X$ is the efficiency of b jets passing a particular bness
cut in subsample X. We assume, again, that the Monte Carlo correctly reproduces the values of
$f_B^X$.
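
The weighted combinations in (B.1) and (B.2) can be illustrated with a short Python sketch. The subsample yields, flavor fractions, and efficiencies below are invented for the example, and the sketch assumes that $f_B^X = 1 - f_L^X$ and that the normalization $N_B$ in (B.2) is the sum of $f_B^X N_X$ over subsamples.

```python
# Hypothetical MC subsamples (illustration only): predicted yield,
# non-b fraction, and b-jet efficiency at some bness cut.
subsamples = {
    "Wjj": {"n_events": 200.0, "f_l": 0.98, "eff": 0.60},
    "Wbb": {"n_events": 50.0, "f_l": 0.50, "eff": 0.65},
    "tt": {"n_events": 300.0, "f_l": 0.45, "eff": 0.70},
}


def combine_subsamples(samples):
    """Return (f_L, e_MC) for the combined sample, following (B.1) and (B.2)."""
    n_total = sum(s["n_events"] for s in samples.values())
    # (B.1): yield-weighted non-b fraction.
    f_l = sum(s["f_l"] * s["n_events"] for s in samples.values()) / n_total
    # Assumed normalization: N_B = sum over X of f_B^X * N_X, with f_B^X = 1 - f_L^X.
    n_b = sum((1.0 - s["f_l"]) * s["n_events"] for s in samples.values())
    # (B.2): efficiency weighted by the b content of each subsample.
    e_mc = sum(s["eff"] * (1.0 - s["f_l"]) * s["n_events"]
               for s in samples.values()) / n_b
    return f_l, e_mc


f_l, e_mc = combine_subsamples(subsamples)
print(f_l, e_mc)
```

The combined efficiency is dominated by whichever subsample contributes the most b jets, which in this made-up example is the tt subsample.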

Given Equations B.1 and B.2, we modify our equation for determining the uncertainty in the
calculated efficiency. We obtain the uncertainty by calculating the uncertainty of the quantity
$(e(b) - e_{MC}(b))$, and find

$$ \sigma_e^2(b) = \frac{e_{raw}(1 - e_{raw})}{N (1 - f_L)^2}
   + \left( \frac{\sigma_m f_L}{1 - f_L} \right)^2
   + \sum_X \frac{\sigma_X^2}{\left[ N_{MC} (1 - f_L) \right]^2}
     \left[ (s_m m_{MC} - e)(f_L - f_L^X) + f_B^X (e_{MC} - e_X) \right]^2 \qquad \text{(B.3)} $$

where the latter term represents a sum over each of the MC subsamples. $N_{MC}$ and $N_B$ are the
total number of events and events with b jets in the MC, and $\sigma_X$ is the uncertainty assigned to the
number of events in each MC subsample. Because we compare only the normalizations of data
and MC in our determination of efficiency (and mistag rate) scale factors, the uncertainty on the
number of events in each MC subsample need only reflect the relative uncertainty on the fraction
of events each subsample contributes to the whole. We assign $\sigma_{Wbb}$ = 20%, and $\sigma_{Wjj}$ = 8.72%
and $\sigma_{tt}$ = 6.78% based on a fit to the distribution of the sum of the highest two bness jets in tt
events.